💾 Prompt Caching
Context Reuse, KV Cache, Inference Optimization, Token Efficiency
Disaggregated Prefill and Decode · research.perplexity.ai · 13h · 🔮 Prefetching
Zero-Waste Agentic RAG: Designing Caching Architectures to Minimize Latency and LLM Costs at Scale · towardsdatascience.com · 1d · ⏳ Lazy Loading
Memory Caching: RNNs with Growing Memory · arxiv.org · 1d · 🧠 LLM Inference
Async in depth · tokio.rs · 18h · 🔄 Async Rust
Context Engineering vs Prompt Engineering · newsletter.systemdesign.one · 1h · Discuss: r/programming · 🪄 Prompt Engineering
Show HN: Personal AI gateway for OpenClaw · github.com · 7h · Discuss: Hacker News · 🤖 Agent Payments
Simulating Queueing · buttondown.com · 16h · 📅 Resource Scheduling
🐥 Optimizing nested array operations in PHP: from O(3n) to O(n) · yellowduck.be · 1d · 🏹 Apache Arrow
The AI Efficiency Survey · sambanova.ai · 1d · 🏆 LLM Benchmarking
🎲 How to Cache and De-duplicate Fetch Requests · deliciousreverie.co.uk · 1d · ⏳ Lazy Loading
Efficient and Portable Mixture-of-Experts Communication · research.perplexity.ai · 13h · 🧠 Inference Serving
i3T4AN/Semantic-skill-space: Semantic Skill Space (SSS): Injecting skill embeddings into KV cache for small-model agents to recover skill behavior with lower prompt-context overhead. · github.com · 1d · Discuss: r/LocalLLaMA · 🎨 Chroma
Agentic Engineering: Building Without Writing · dehora.net · 2h · Discuss: Hacker News · 🪄 Prompt Engineering
The Architecture Behind Open-Source LLMs · blog.bytebytego.com · 18h · 🏗️ LLM Infrastructure
Supreme Court ducks AI copyright question · therundown.ai · 1h · 🛡️ AI Security
Step One to Using AI Well: Stop Using ChatGPT · yage.ai · 6h · 👨‍💻 AI Coding
KEEP: A KV-Cache-Centric Memory Management System for Efficient Embodied Planning · arxiv.org · 1d · 💨 Cache-Friendly Algorithms
I built a persistent memory layer for AI agents in Rust · news.ycombinator.com · 15h · Discuss: Hacker News · 🔎 Tantivy
IaC in AWS with canned templates, price estimates, low-code, CLI · stacktape.com · 20h · 🌟 Datastar
Communication-efficient Distributed Statistical Inference for Massive Data with Heterogeneous Auxiliary Information · jmlr.org · 22h · 🧠 Inference Serving